Fully Dynamic Inference with Deep Neural Networks

Authors

Abstract

Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction. Their cross-domain success, however, is often achieved at the expense of computational cost, high memory bandwidth, and long inference latency, which prevents their deployment in resource-constrained and time-sensitive scenarios, such as edge-side inference in self-driving cars. While recently developed methods for creating efficient deep neural networks are making real-world deployment more feasible by reducing model size, they do not fully exploit input properties on a per-instance basis to maximize computational efficiency and task accuracy. In particular, most existing methods typically use a one-size-fits-all approach that identically processes all inputs. Motivated by the fact that different images require different feature embeddings to be accurately classified, we propose a fully dynamic paradigm that imparts deep convolutional neural networks with hierarchical inference dynamics at the level of layers and individual filters/channels. Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. L-Net and C-Net also learn how to scale the retained computation outputs. By integrating the two into a joint design framework, called LC-Net, we consistently outperform state-of-the-art dynamic frameworks with respect to both efficiency and classification accuracy. On the CIFAR-10 dataset, LC-Net results in up to 11.9× fewer floating-point operations (FLOPs) and up to 3.3 percent higher accuracy compared with other dynamic inference methods. On ImageNet, LC-Net achieves up to 1.4× fewer FLOPs and up to 4.6 percent higher Top-1 accuracy than the other methods.
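The per-instance channel-skipping idea in the abstract can be illustrated with a minimal sketch: a tiny gating network looks at a pooled summary of a feature map, decides which channels to keep for this particular input, and rescales the retained ones. All names, shapes, and the linear-gate design below are illustrative assumptions, not the paper's actual C-Net implementation.

```python
import numpy as np

def channel_gate(features, gate_w, gate_b, threshold=0.5):
    """Per-instance channel gating (illustrative sketch).

    features: (C, H, W) feature map for one input instance
    gate_w:   (C, C) weights of a tiny hypothetical gating network
    gate_b:   (C,)   biases of the gating network
    Returns the gated feature map and the boolean keep-mask.
    """
    # Global average pooling summarizes each channel with one scalar.
    descriptor = features.mean(axis=(1, 2))                       # (C,)
    # A tiny linear gate plus sigmoid yields a per-channel score in (0, 1).
    scores = 1.0 / (1.0 + np.exp(-(gate_w @ descriptor + gate_b)))
    keep = scores > threshold                                     # channels to retain
    # Retained channels are rescaled by their scores; skipped channels are
    # zeroed, so their convolutions could be elided at inference time.
    gated = features * (scores * keep)[:, None, None]
    return gated, keep

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))        # 8 channels, 4x4 spatial map
w = rng.standard_normal((8, 8)) * 0.5
b = rng.standard_normal(8)
out, mask = channel_gate(x, w, b)
print(mask.sum(), "of", mask.size, "channels kept for this instance")
```

Because the mask depends on the pooled descriptor of each input, different images activate different subsets of channels, which is the per-instance behavior the paper contrasts with one-size-fits-all static pruning.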


Similar Articles

Training and Inference with Integers in Deep Neural Networks

Research on deep neural networks with discrete parameters and their deployment in embedded systems has been an active and promising topic. Although previous works have successfully reduced precision in inference, transferring both training and inference processes to low-bitwidth integers has not been demonstrated simultaneously. In this work, we develop a new method termed as “WAGE” to discret...


DyVEDeep: Dynamic Variable Effort Deep Neural Networks

Deep Neural Networks (DNNs) have advanced the state-of-the-art in a variety of machine learning tasks and are deployed in increasing numbers of products and services. However, the computational requirements of training and evaluating large-scale DNNs are growing at a much faster pace than the capabilities of the underlying hardware platforms that they are executed upon. In this work, we propose...


Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets

We propose Impatient Deep Neural Networks (DNNs) which deal with dynamic time budgets during application. They allow for individual budgets given a priori for each test example and for anytime prediction, i.e. a possible interruption at multiple stages during inference while still providing output estimates. Our approach can therefore tackle the computational costs and energy demands of DNNs in...


Efficient Stochastic Inference of Bitwise Deep Neural Networks

Recently published methods enable training of bitwise neural networks, which allow a reduced representation of down to a single bit per weight. We present a method that exploits ensemble decisions based on multiple stochastically sampled network models to increase the classification accuracy of bitwise neural networks at inference. Our experiments with the CIFAR-10 and ...


Fully Connected Deep Structured Networks

Convolutional neural networks with many layers have recently been shown to achieve excellent results on many high-level tasks such as image classification, object detection and more recently also semantic segmentation. Particularly for semantic segmentation, a two-stage procedure is often employed. Hereby, convolutional networks are trained to provide good local pixel-wise features for the seco...



Journal

Journal: IEEE Transactions on Emerging Topics in Computing

Year: 2021

ISSN: 2168-6750, 2376-4562

DOI: https://doi.org/10.1109/tetc.2021.3056031